Methods and systems for selective quantitation and detection of allergens including Gly m 7

ABSTRACT

The invention relates to methods and systems taking advantage of bioinformatic investigations to identify candidate signature peptides for quantitative multiplex analysis of complex protein samples from plants, plant parts, and/or food products using mass spectrometry. Provided are uses and methods for selecting candidate signature peptides for quantitation using a bioinformatic approach. Also provided are systems comprising a chromatography and mass spectrometry for using selected signature peptides.

CROSS REFERENCE TO RELATED APPLICATION

This is a national phase entry under 35 U.S.C. § 371 of international patent application PCT/US2018/14765, filed on Jan. 23, 2018 and published in English as international patent publication WO2018140370 on Aug. 2, 2018, which claims priority to the benefit of U.S. Provisional Patent Application Ser. No. 62/450,246 filed Jan. 25, 2017, the disclosure of which is hereby incorporated by reference in its entirety.

BACKGROUND OF THE INVENTION

The current methods for analysis of gene expression in plants that are preferred in the art include DNA-based techniques (for example PCR and/or RT-PCR); the use of reporter genes; Southern blotting; and immunochemistry. All of these methodologies suffer from various shortcomings. Detection of known and potential allergens in plants, plant parts, and/or food products is an important subject for public safety.

Although mass spectrometry has been disclosed previously, existing approaches are limited because they lack selective and sensitive quantitation. There remains a need for a high-throughput method for selective and sensitive quantitation of known and/or potential allergens in plants, plant parts, and/or food products.

SUMMARY OF THE INVENTION

The invention relates to methods and systems taking advantage of bioinformatic investigations to identify candidate signature peptides for quantitative multiplex analysis of complex protein samples from plants, plant parts, and/or food products using mass spectrometry. Provided are uses and methods for selecting candidate signature peptides for quantitation using a bioinformatic approach. Also provided are systems comprising a chromatography and mass spectrometry for using selected signature peptides.

In one aspect, provided is a method of selecting candidate signature peptides for quantitation of known allergens and potential allergens from a plant-based sample. The method comprises:

-   (a) identifying potential allergens based on homology to at least one known allergen protein sequence;
-   (b) performing sequence alignment of the at least one known allergen and potential allergens identified in step (a);
-   (c) selecting a consensus sequence or representative sequence based on the sequence alignment;
-   (d) determining a plurality of candidate signature peptides based on conserved regions or domains from the sequence alignment and in silico digestion data of the consensus sequence or representative sequence selected in step (c); and
-   (e) quantitating the amount of the at least one known allergen and potential allergens in the plant-based sample based on measurements of the signature peptides.
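Step (d) can be illustrated with a minimal sketch. It assumes trypsin as the protease (the text only requires "at least one protease") and uses the common simplified rule of cleaving after K or R unless the next residue is P. The function names, the minimum peptide length, and the example sequences are hypothetical, and a plain substring test stands in for a proper alignment-based conservation check.

```python
def tryptic_digest(sequence, min_length=6):
    """In silico digestion: cleave after K/R, except when followed by P."""
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        at_end = i == len(sequence) - 1
        cleave = residue in "KR" and not at_end and sequence[i + 1] != "P"
        if cleave or at_end:
            peptides.append(sequence[start:i + 1])
            start = i + 1
    # Very short peptides are rarely useful as signature peptides (assumed cutoff).
    return [p for p in peptides if len(p) >= min_length]


def conserved_peptides(representative, homologs, min_length=6):
    """Keep digestion products of the representative sequence that occur in
    every homolog. Substring presence is a simplification: it does not check
    that the homolog would release the same peptide on digestion."""
    return [
        peptide
        for peptide in tryptic_digest(representative, min_length)
        if all(peptide in homolog for homolog in homologs)
    ]


# Hypothetical toy sequences, not taken from the sequence listing.
representative = "AAKPGRSLTSIGEKVAADLR"
homologs = ["MMSLTSIGEKVAADLRTT", "GGSLTSIGEKAA"]
candidates = conserved_peptides(representative, homologs)
```

In practice the candidates returned by such a screen would still need to be checked against the criteria for signature peptides and confirmed experimentally.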

In one embodiment, the quantitating step uses a column chromatography and mass spectrometry. In another embodiment, the quantitating step comprises measuring the plurality of candidate signature peptides using high resolution accurate mass spectrometry (HRAM MS). In another embodiment, the quantitating step comprises calculating corresponding peak heights or peak areas of the candidate signature peptides from mass spectrometry. In another embodiment, the quantitating step comprises comparing data from high fragmentation mode and low fragmentation mode from mass spectrometry.

In one embodiment, the at least one known allergen comprises Gly m 7. In another embodiment, the at least one known allergen comprises at least one allergen selected from the group consisting of Gly m 1, Gly m 3, Gly m 4, Gly m 5 (beta-conglycinin), Gly m 6 (Glycinin) G1, Gly m 6 (Glycinin) G2, Gly m 6 (Glycinin) G3, Gly m 6 (Glycinin) G4, Gly m 6 (Glycinin) precursor, Gly m 6 (Glycinin) G4 precursor, Gly m 7, Kunitz trypsin inhibitor 1, Kunitz trypsin inhibitor 3, Gly m Bd 28 K, Gly m Bd 30 K, Gly m 8 (2S albumin), Lectin, and lipoxygenase. In another embodiment, the potential allergens comprise at least one sequence selected from SEQ ID NOs: 12-15. In another embodiment, the candidate signature peptides comprise at least one sequence selected from SEQ ID NOs: 32-43. In another embodiment, the candidate signature peptides comprise SEQ ID NO: 32, 33, 37, or 41. In another embodiment, the plant-based sample comprises a soybean seed or part of a soybean seed.

In another aspect, provided is a system for quantitating one or more proteins of interest with known amino acid sequence in a plant-based sample. The system comprises:

-   (a) a high-throughput means for extracting proteins from a plant-based sample;
-   (b) a process module for digesting extracted proteins with at least one protease;
-   (c) a separation module for separating peptides in a single step;
-   (d) a selection module for selecting a plurality of signature peptides for at least one known allergen and potential allergens; and
-   (e) a mass spectrometry for measuring the plurality of signature peptides.

In one embodiment, the separation module comprises a column chromatography. In a further embodiment, the column chromatography comprises a liquid column chromatography. In another embodiment, the mass spectrometry comprises a high resolution accurate mass spectrometry (HRAM MS). In another embodiment, the selection module uses a method provided herein.

In one embodiment, the at least one known allergen comprises Gly m 7. In another embodiment, the at least one known allergen comprises at least one allergen selected from the group consisting of Gly m 1, Gly m 3, Gly m 4, Gly m 5 (beta-conglycinin), Gly m 6 (Glycinin) G1, Gly m 6 (Glycinin) G2, Gly m 6 (Glycinin) G3, Gly m 6 (Glycinin) G4, Gly m 6 (Glycinin) precursor, Gly m 6 (Glycinin) G4 precursor, Gly m 7, Kunitz trypsin inhibitor 1, Kunitz trypsin inhibitor 3, Gly m Bd 28 K, Gly m Bd 30 K, Gly m 8 (2S albumin), Lectin, and lipoxygenase. In another embodiment, the potential allergens comprise at least one sequence selected from SEQ ID NOs: 12-15. In another embodiment, the signature peptides comprise at least one sequence selected from SEQ ID NOs: 32-43. In another embodiment, the signature peptides comprise SEQ ID NO: 32, 33, 37, or 41. In another embodiment, the plant-based sample comprises a soybean seed or part of a soybean seed.

In another aspect, provided is a high-throughput method of quantitating at least one allergen with known amino acid sequence and homologous potential allergens in a plant-based sample. The method comprises using the system provided herein.

BRIEF DESCRIPTION OF THE FIGURES

FIG. 1 shows a representative analysis workflow for the methods and systems disclosed herein.

FIGS. 2-13 show representative SRM LC-MS/MS for selected signature peptides SEQ ID NO: 32 AAELASMSAGAVK, SEQ ID NO: 33 AMGDIGGR, SEQ ID NO: 34 DTPQGSIEALQAGER, SEQ ID NO: 35 DYTLQAAEK, SEQ ID NO: 36 GLAASAGETAK, SEQ ID NO: 37 QSWLETR, SEQ ID NO: 38 SAAGYAAK, SEQ ID NO: 39 SAGGTTASYVGEK, SEQ ID NO: 40 SAWEQISNYSDQATQGVK, SEQ ID NO: 41 SLTSIGEK, SEQ ID NO: 42 TTAVITCTLEK, and SEQ ID NO: 43 VAADLR from a soybean sample chromatogram.

FIG. 14 shows sequence alignments among potential homologs of Gly m 7.

DETAILED DESCRIPTION OF THE INVENTION

It is of significance to enable a sensitive multiplex assay that is capable of selectively detecting and measuring levels of proteins of interest. Currently, relevant technologies for protein expression detection rely heavily on traditional immunochemistry technologies, which struggle to accommodate the volume of data that must be generated per sample.

Soybean is a multi-billion dollar commodity due to its balanced composition of 2:2:1 protein, starch, and oil by weight. Many seeds, including soybeans, contain proteins that are allergens and anti-nutritional factors. As such, there are concerns regarding the potential of altering allergen levels in genetically-modified soybean varieties when compared to varieties developed through traditional breeding. The measurement of allergen levels in crops has been achieved almost exclusively by immunoassays, such as enzyme-linked immunosorbent assays (ELISA) or IgE-immunoblotting; however, these methods suffer from limited sensitivity and specificity and high variability.

There has been recent interest in developing LC-MS/MS based methods to quantify several plant-expressed proteins in a single analysis. Analysis using “signature peptides” involves tracking protein expression levels by quantifying several highly specific digest fragments of the proteins of interest. This can typically be accomplished using liquid chromatography coupled with selected reaction monitoring (SRM) tandem mass spectrometry. Improved multiplexed LC-MS/MS methods and systems are provided herein to enable simultaneous quantitation of several allergen proteins in transgenic and non-transgenic soybean. Methods and systems provided herein are validated for analytical figures of merit including accuracy, precision, linearity, limits of detection and quantitation; and for other considerations including sample throughput, transferability, and ease of use. The allergens can be quantified using a multiplexing format, and samples can be harvested from the field, processed, and analyzed/quantitated, for example, within a one-day (twenty-four hour) window (from field to measured numerical value). In addition, sample preparations of the methods and systems provided can be fully scalable for high-throughput, thus enabling hundreds of samples to be analyzed in a single batch.

Representative soybean allergens include, for example, Gly m 1, Gly m 3, Gly m 4, Gly m 5 (beta-conglycinin), Gly m 6 (Glycinin) G1, Gly m 6 (Glycinin) G2, Gly m 6 (Glycinin) G3, Gly m 6 (Glycinin) G4, Gly m 6 (Glycinin) precursor, Gly m 6 (Glycinin) G4 precursor, Kunitz trypsin inhibitor 1, Kunitz trypsin inhibitor 3, Gly m Bd 28 K, Gly m Bd 30 K, Gly m 8 (2S albumin), Lectin, and lipoxygenase.

Representative wheat allergens include, for example, profilin (Tri a12), wheat lipid transfer protein 1 (Tri a14), agglutinin isolectin 1 (Tri a18), omega-5 gliadin-seed storage protein (Tri a19), gliadin (Tri a20; NCBI Accession Nos. M10092, M11073, M11074, M11075, M11076, K03074, and K03075), thioredoxin (Tri a25), high molecular weight glutenin (Tri a26), low molecular weight glutenin (Tri a36), and alpha purothionin (Tri a37).

Representative corn allergens include, for example, maize lipid transfer protein (LTP) (Zea m14) and thioredoxin (Zea m25).

Representative rice allergens include, for example, rice profilin A (Ory s12).

In some embodiments, the methods and systems provided use liquid chromatography coupled to tandem mass spectrometry (LC-MS/MS) to detect protein expression levels of sixteen different allergens from soybean. In some embodiments, the methods and systems enable analysis of each allergen by itself or combined with additional proteins for a multiplexing assay for qualitative and quantitative analysis in plant matrices.

In some embodiments, the mass spectrometry detection for quantitative studies may be accomplished using selected reaction monitoring, performed on a triple quadrupole mass spectrometer. Using this type of instrumentation, the ion (peptide) of interest formed in the source is first mass-selected, this precursor ion is then dissociated in the collision region of the MS, and a specific product (daughter) ion is mass-selected and counted. In some embodiments, the mass spectrometry detection for quantitative studies may be accomplished using selected reaction monitoring (SRM). Using a particular type of instrumentation, the ion of interest formed in the source is first mass-selected, this precursor (protein) ion is then dissociated in the collision region of the mass spectrometer (MS), and a specific product (peptide) ion is mass-selected and counted. In some embodiments, counts per unit time may provide an integratable peak area from which amounts or concentrations of analytes can be determined. In some embodiments, high resolution accurate mass (HRAM) monitoring is used for quantitation and is performed on an HRAM-capable mass spectrometer, which may include, but is not limited to, a hybrid quadrupole-time-of-flight, quadrupole-orbitrap, ion trap-orbitrap, or quadrupole-ion-trap-orbitrap (tribrid) mass spectrometer. Using this type of instrumentation, peptides are not subject to fragmentation conditions, but rather are measured as intact peptides using full scan or targeted scan modes (for example selective ion monitoring mode or SIM). An integratable peak area can be determined by generating an extracted ion chromatogram for each specific analyte, and amounts or concentrations of analytes can be calculated. The high resolution and accurate mass nature of the data enables highly specific and sensitive ion signals for the analyte (protein and/or peptide) of interest.
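The extracted ion chromatogram and peak area described above can be sketched as follows. This is an illustrative outline only: the scan data layout, the 10 ppm tolerance, and the function names are assumptions, and real data would normally be read from vendor or open-format files and baseline-corrected before integration.

```python
def extracted_ion_chromatogram(scans, target_mz, tol_ppm=10.0):
    """Build an XIC from full-scan data.

    scans: list of (retention_time_min, [(mz, intensity), ...]) tuples
    (a hypothetical in-memory layout). Returns (retention_time, summed
    intensity within the ppm tolerance of target_mz) for each scan.
    """
    tolerance = target_mz * tol_ppm / 1e6
    xic = []
    for retention_time, peaks in scans:
        signal = sum(
            intensity for mz, intensity in peaks
            if abs(mz - target_mz) <= tolerance
        )
        xic.append((retention_time, signal))
    return xic


def peak_area(xic, rt_start, rt_end):
    """Integrate the XIC between two retention times (trapezoidal rule)."""
    points = [(rt, s) for rt, s in xic if rt_start <= rt <= rt_end]
    area = 0.0
    for (t0, s0), (t1, s1) in zip(points, points[1:]):
        area += 0.5 * (s0 + s1) * (t1 - t0)
    return area


# Hypothetical three-scan example around m/z 322.6899.
scans = [
    (1.0, [(322.6899, 0.0), (500.0, 9.0)]),
    (1.1, [(322.6900, 100.0)]),
    (1.2, [(322.6898, 0.0)]),
]
xic = extracted_ion_chromatogram(scans, target_mz=322.6899)
area = peak_area(xic, rt_start=1.0, rt_end=1.2)
```

The narrow ppm window is what lets high resolution accurate mass data separate the analyte signal from nearby ions such as the m/z 500.0 peak in the first scan.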

Unless otherwise stated, the following terms used in this application, including the specification and claims, have the definitions given below. It must be noted that, as used in the specification and the appended claims, the singular forms “a,” “an,” and “the” include plural referents unless the context clearly dictates otherwise.

As used herein, the term “bioconfinement” refers to restriction of the movement of genetically modified plants or their genetic material to designated areas. The term includes physical, physicochemical, biological confinement, as well as other forms of confinement that prevent the survival, spread or reproduction of genetically modified plants in the natural environment or in artificial growth conditions.

As used herein, the term “complex protein sample” is used to distinguish a sample from a purified protein sample. A complex protein sample contains multiple proteins, and may additionally contain other contaminants.

As used herein, the general term “mass spectrometry” or “MS” refers to any suitable mass spectrometry method, device or configuration including, e.g., electrospray ionization (ESI), matrix-assisted laser desorption/ionization (MALDI) MS, MALDI-time of flight (TOF) MS, atmospheric pressure (AP) MALDI MS, vacuum MALDI MS, or combinations thereof. Mass spectrometry devices measure the molecular mass of a molecule (as a function of the molecule's mass-to-charge ratio) by measuring the molecule's flight path through a set of magnetic and electric fields. The mass-to-charge ratio is a physical quantity that is widely used in the electrodynamics of charged particles. The mass-to-charge ratio of a particular peptide can be calculated, a priori, by one of skill in the art. Two particles with different mass-to-charge ratio will not move in the same path in a vacuum when subjected to the same electric and magnetic fields.
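As an illustration of calculating a peptide's mass-to-charge ratio a priori, the sketch below sums standard monoisotopic residue masses, adds one water for the termini, and adds one proton per positive charge. It assumes an unmodified peptide made of the 20 standard amino acids; the function name is hypothetical.

```python
# Monoisotopic residue masses in daltons (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # H2O added for the free N- and C-termini
PROTON = 1.007276   # mass of one proton


def peptide_mz(sequence, charge=1):
    """m/z of an unmodified peptide carrying `charge` protons."""
    neutral_mass = sum(RESIDUE_MASS[aa] for aa in sequence) + WATER
    return (neutral_mass + charge * PROTON) / charge


# SEQ ID NO: 43 (VAADLR) as singly and doubly protonated ions.
mz_1 = peptide_mz("VAADLR", charge=1)
mz_2 = peptide_mz("VAADLR", charge=2)
```

For VAADLR this gives approximately 644.37 for the singly charged ion and 322.69 for the doubly charged ion.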

Mass spectrometry instruments consist of three modules: an ion source, which converts the sample molecules into ions; a mass analyzer, which sorts the ions by their masses by applying electromagnetic fields; and a detector, which measures the value of an indicator quantity and thus provides data for calculating the abundances of each ion present. The technique has both qualitative and quantitative applications. These include identifying unknown compounds, determining the isotopic composition of elements in a molecule, determining the structure of a compound by observing its fragmentation, and quantifying the amount of a compound in a sample.

A detailed overview of mass spectrometry methodologies and devices can be found in the following references which are hereby incorporated by reference: Carr and Annan (1997) Overview of peptide and protein analysis by mass spectrometry. In: Current Protocols in Molecular Biology, edited by Ausubel, et al. New York: Wiley, p. 10.21.1-10.21.27; Patterson and Aebersold (1995) Electrophoresis 16: 1791-1814; Patterson (1998) Protein identification and characterization by mass spectrometry. In: Current Protocols in Molecular Biology, edited by Ausubel, et al. New York: Wiley, p. 10.22.1-10.22.24; and Domon and Aebersold (2006) Science 312(5771):212-17.

As the term is used herein, proteins and/or peptides are “multiplexed” when two or more proteins and/or peptides of interest are present in the same sample.

As used herein, a “plant trait” may refer to any single feature or quantifiable measurement of a plant.

As used herein, the phrase “peptide” or “peptides” may refer to short polymers formed from the linking, in a defined order, of α-amino acids. Peptides may also be generated by the digestion of polypeptides, for example proteins, with a protease.

As used herein, the phrase “protein” or “proteins” may refer to organic compounds made of amino acids arranged in a linear chain and joined together by peptide bonds between the carboxyl and amino groups of adjacent amino acid residues. The sequence of amino acids in a protein is defined by the sequence of a gene, which is encoded in the genetic code. In general, the genetic code specifies 20 standard amino acids; however, in certain organisms the genetic code can include selenocysteine and, in certain archaea, pyrrolysine. The residues in a protein are often observed to be chemically modified by post-translational modification, which can happen either before the protein is used in the cell, or as part of control mechanisms. Protein residues may also be modified by design, according to techniques familiar to those of skill in the art. As used herein, the term “protein” encompasses linear chains comprising naturally occurring amino acids, synthetic amino acids, modified amino acids, or combinations of any or all of the above.

As used herein, the term “single injection” refers to the initial step in the operation of a MS or LC-MS device. When a protein sample is introduced into the device in a single injection, the entire sample is introduced in a single step.

As used herein, the phrase “signature peptide” refers to an identifier (short peptide) sequence of a specific protein. Any protein may contain an average of between 10 and 100 signature peptides. Typically, signature peptides meet at least one of the following criteria: easily detected by mass spectrometry, predictably and stably eluted from a liquid chromatography (LC) column, enriched by reversed phase high performance liquid chromatography (RP-HPLC), good ionization, good fragmentation, or combinations thereof. A peptide that is readily quantified by mass spectrometry typically meets at least one of the following criteria: readily synthesized, ability to be highly purified (>97%), soluble in ≤20% acetonitrile, low non-specific binding, oxidation resistant, post-synthesis modification resistant, and a hydrophobicity or hydrophobicity index ≥10 and ≤40. The hydrophobicity index can be calculated according to Krokhin, Molecular and Cellular Proteomics 3 (2004) 908, which is incorporated by reference. It is known that a peptide having a hydrophobicity index less than 10 or greater than 40 may not be reproducibly resolved or eluted by an RP-HPLC column.
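The hydrophobicity window stated above lends itself to a simple screen. The sketch below does not reimplement the Krokhin calculation; it assumes the hydrophobicity index of each candidate has already been computed elsewhere and is supplied as a number. The function name and the example index values are hypothetical.

```python
def within_hydrophobicity_window(index_by_peptide, low=10.0, high=40.0):
    """Return peptides whose precomputed hydrophobicity index lies in
    [low, high], the range described as reproducibly resolved by RP-HPLC."""
    return [
        peptide
        for peptide, index in index_by_peptide.items()
        if low <= index <= high
    ]


# Hypothetical index values for illustration only.
example_indices = {"PEPTIDEA": 8.5, "PEPTIDEB": 22.0, "PEPTIDEC": 41.3}
retained = within_hydrophobicity_window(example_indices)
```

A screen of this kind would be one of several filters applied to candidates, alongside detectability, solubility, and stability considerations.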

As used herein, the term “stacked” refers to the presence of multiple heterologous polynucleotides incorporated in the genome of a plant.

Tandem mass spectrometry: In tandem mass spectrometry, a parent ion generated from a molecule of interest may be filtered in a mass spectrometry instrument, and the parent ion subsequently fragmented to yield one or more daughter ions that are then analyzed (detected and/or quantified) in a second mass spectrometry procedure. In some embodiments, the use of tandem mass spectrometry is excluded. In these embodiments, tandem mass spectrometry is not used in the methods and systems provided. Thus, neither parent ions nor daughter ions are generated in these embodiments.

As used herein, the term “transgenic plant” includes reference to a plant which comprises within its genome a heterologous polynucleotide. Generally, the heterologous polynucleotide is stably integrated within the genome such that the polynucleotide is passed on to successive generations. The heterologous polynucleotide may be integrated into the genome alone or as part of a recombinant expression cassette. “Transgenic” is used herein to include any cell, cell line, callus, tissue, plant part or plant, the genotype of which has been altered by the presence of heterologous nucleic acid including those transgenic plants initially so altered as well as those created by sexual crosses or asexual propagation from the initial transgenic plant.

Any plants that provide useful plant parts may be treated in the practice of the present invention. Examples include plants that provide flowers, fruits, vegetables, and grains.

As used herein, the phrase “plant” includes dicotyledonous plants and monocotyledonous plants. Examples of dicotyledonous plants include tobacco, Arabidopsis, soybean, tomato, papaya, canola, sunflower, cotton, alfalfa, potato, grapevine, pigeon pea, pea, Brassica, chickpea, sugar beet, rapeseed, watermelon, melon, pepper, peanut, pumpkin, radish, spinach, squash, broccoli, cabbage, carrot, cauliflower, celery, Chinese cabbage, cucumber, eggplant, and lettuce. Examples of monocotyledonous plants include corn, rice, wheat, sugarcane, barley, rye, sorghum, orchids, bamboo, banana, cattails, lilies, oat, onion, millet, and triticale. Examples of fruit include banana, pineapple, oranges, grapes, grapefruit, watermelon, melon, apples, peaches, pears, kiwifruit, mango, nectarines, guava, persimmon, avocado, lemon, fig, and berries. Examples of flowers include baby's breath, carnation, dahlia, daffodil, geranium, gerbera, lily, orchid, peony, Queen Anne's lace, rose, snapdragon, or other cut-flowers or ornamental flowers, potted-flowers, and flower bulbs.

The specificity allowed in a mass spectrometry approach for identifying a single protein from a complex sample is unique in that only the sequence of the protein of interest is required in order to identify the protein of interest. Compared to other formats of multiplexing, mass spectrometry is unique in being able to exploit the full length of a protein's primary amino acid sequence to target unique identifier-type portions of a protein's primary amino acid sequence to virtually eliminate non-specific detection. In some embodiments of the present invention, a proteolytic fragment or set of proteolytic fragments that uniquely identifies a protein(s) of interest is used to detect the protein(s) of interest in a complex protein sample.

In some embodiments, disclosed methods enable the quantification or determination of ratios of multiple proteins in a complex protein sample by a single mass spectrometry analysis, as opposed to measuring each protein of interest individually multiple times and compiling the individual results into one sample result.

In some embodiments, the present disclosure also provides methods useful for the development and use of transgenic plant technology. Specifically, disclosed methods may be used to maintain the genotype of transgenic plants through successive generations. Also, some embodiments of the methods disclosed herein may be used to provide high-throughput analysis of non-transgenic plants that are at risk of being contaminated with transgenes from neighboring plants, for example, by cross-pollination. By these embodiments, bioconfinement of transgenes may be facilitated and/or accomplished. In other embodiments, methods disclosed herein may be used to screen the results of a plant transformation procedure in a high-throughput manner to identify transformants that exhibit desirable expression characteristics.

The mass-to-charge ratio may be determined using a quadrupole analyzer. For example, in a “quadrupole” or “quadrupole ion trap” instrument, ions in an oscillating radio frequency field experience a force proportional to the DC potential applied between electrodes, the amplitude of the RF signal, and m/z. The voltage and amplitude can be selected so that only ions having a particular m/z travel the length of the quadrupole, while all other ions are deflected. Thus, quadrupole instruments can act as a “mass filter” and “mass detector” for the ions injected into the instrument.

Collision-induced dissociation (“CID”) is often used to generate the daughter ions for further detection. In CID, parent ions gain energy through collisions with an inert gas, such as argon, and are subsequently fragmented by a process referred to as “unimolecular decomposition.” Sufficient energy must be deposited in the parent ion so that certain bonds within the ion can be broken due to increased energy.

The mass spectrometer typically provides the user with an ion scan; that is, the relative abundance of each m/z over a given range (for example 10 to 1200 amu). The results of an analyte assay, that is, a mass spectrum, can be related to the amount of the analyte in the original sample by numerous methods known in the art. For example, given that sampling and analysis parameters are carefully controlled, the relative abundance of a given ion can be compared to a table that converts that relative abundance to an absolute amount of the original molecule. Alternatively, molecular standards (e.g., internal standards and external standards) can be run with the samples and a standard curve constructed based on ions generated from those standards. Using such a standard curve, the relative abundance of a given ion can be converted into an absolute amount of the original molecule. Numerous other methods for relating the presence or amount of an ion to the presence or amount of the original molecule are well known to those of ordinary skill in the art.
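The standard-curve approach can be illustrated with an ordinary least-squares line fitted to the responses of standards and then inverted for an unknown. This is a minimal sketch that assumes a linear, unweighted response over the calibrated range; the function names and the calibration values are hypothetical.

```python
def fit_standard_curve(amounts, responses):
    """Fit response = slope * amount + intercept by least squares."""
    n = len(amounts)
    mean_x = sum(amounts) / n
    mean_y = sum(responses) / n
    sxx = sum((x - mean_x) ** 2 for x in amounts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(amounts, responses))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def amount_from_response(response, slope, intercept):
    """Convert a measured response (e.g. peak area) back into an amount."""
    return (response - intercept) / slope


# Hypothetical standards: amount (e.g. ng) versus peak area.
standard_amounts = [1.0, 2.0, 4.0, 8.0]
standard_areas = [210.0, 410.0, 810.0, 1610.0]
slope, intercept = fit_standard_curve(standard_amounts, standard_areas)
unknown_amount = amount_from_response(1010.0, slope, intercept)
```

When an internal standard is used, the response passed to these functions would typically be the ratio of analyte peak area to internal standard peak area rather than the raw area.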

The choice of ionization method can be determined based on the analyte to be measured, type of sample, the type of detector, the choice of positive versus negative mode, etc. Ions can be produced using a variety of methods including, but not limited to, electron ionization, chemical ionization, fast atom bombardment, field desorption, matrix-assisted laser desorption ionization (MALDI), surface enhanced laser desorption ionization (SELDI), desorption electrospray ionization (DESI), photon ionization, electrospray ionization, and inductively coupled plasma. Electrospray ionization refers to methods in which a solution is passed along a short length of capillary tube, to the end of which is applied a high positive or negative electric potential. Solution reaching the end of the tube is vaporized (nebulized) into a jet or spray of very small droplets of solution in solvent vapor. This mist of droplets flows through an evaporation chamber which is heated to prevent condensation and to evaporate solvent. As the droplets get smaller the electrical surface charge density increases until such time that the natural repulsion between like charges causes ions as well as neutral molecules to be released.

The effluent of an LC may be injected directly and automatically (i.e., “in-line”) into the electrospray device. In some embodiments, proteins contained in an LC effluent are first ionized by electrospray into a parent ion.

Various mass analyzers can be used in a liquid chromatography-mass spectrometry combination (LC-MS). Exemplary mass analyzers include, but are not limited to, single quadrupole, triple quadrupole, ion trap, TOF (time of flight), and quadrupole-time of flight (Q-TOF).

The quadrupole mass analyzer may consist of 4 circular rods, set parallel to each other. In a quadrupole mass spectrometer (QMS), the quadrupole is the component of the instrument responsible for filtering sample ions, based on their mass-to-charge ratio (m/z). Ions are separated in a quadrupole based on the stability of their trajectories in the oscillating electric fields that are applied to the rods.

An ion trap is a combination of electric or magnetic fields that captures ions in a region of a vacuum system or tube. Ion traps can be used in mass spectrometry while the ion's quantum state is manipulated.

Time-of-flight mass spectrometry (TOFMS) is a method of mass spectrometry in which an ion's mass-to-charge ratio is determined via a time measurement. Ions are accelerated by an electric field of known strength. This acceleration results in an ion having the same kinetic energy as any other ion that has the same charge. The velocity of the ion depends on the mass-to-charge ratio. The time that it subsequently takes for the particle to reach a detector at a known distance is measured. This time will depend on the mass-to-charge ratio of the particle (heavier particles reach lower speeds). From this time and the known experimental parameters one can find the mass-to-charge ratio of the ion.
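The relationship between flight time and mass-to-charge ratio can be made concrete with an idealized linear time-of-flight model. The sketch assumes that every ion gains kinetic energy zeU from an accelerating potential U and then drifts a field-free length d, and it ignores initial energy spread, acceleration time, and reflectron geometry. The function names and the example voltage and drift length are assumptions.

```python
import math

ELEMENTARY_CHARGE = 1.602176634e-19   # coulombs
DALTON = 1.66053906660e-27            # kilograms


def flight_time(mz, accel_voltage, drift_length):
    """Drift time in seconds for an ion of the given m/z (Da per charge).

    From z*e*U = (1/2)*m*v**2, the velocity is sqrt(2*U / (m/(z*e)))."""
    mass_per_charge = mz * DALTON / ELEMENTARY_CHARGE   # kg per coulomb
    velocity = math.sqrt(2.0 * accel_voltage / mass_per_charge)
    return drift_length / velocity


def mz_from_flight_time(time_s, accel_voltage, drift_length):
    """Invert the same model to recover m/z from a measured drift time."""
    mass_per_charge = 2.0 * accel_voltage * (time_s / drift_length) ** 2
    return mass_per_charge * ELEMENTARY_CHARGE / DALTON


# Hypothetical instrument: 20 kV acceleration, 1 m drift tube.
t_100 = flight_time(100.0, accel_voltage=20000.0, drift_length=1.0)
t_400 = flight_time(400.0, accel_voltage=20000.0, drift_length=1.0)
```

Because flight time scales with the square root of m/z in this model, an ion with four times the m/z takes twice as long to reach the detector.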

In some embodiments, the particular instrument used by the methods and/or systems provided may comprise a high fragmentation mode and a low fragmentation mode (or alternatively a non-fragmentation mode). Such different modes may include alternating scan high and low energy acquisition methodology to generate high resolution mass data. In some embodiments, the high resolution mass data may comprise a product data set (for example data derived from product ions (fragmented ions) under the high fragmentation mode) and a precursor data set (for example data derived from precursor ions (unfragmented ions) under the low fragmentation or non-fragmentation mode).

In some embodiments, the methods and/or systems provided use a mass spectrometer comprising a filtering device that may be used in the selection step, a fragmentation device that may be used in the fragmentation step, and/or one or more mass analyzers that may be used in the acquisition and/or mass spectrum creation step or steps.

The filtering device and/or mass analyzer may comprise a quadrupole. The selection step and/or acquisition step and/or mass spectrum creation step or steps may involve the use of a resolving quadrupole. Additionally or alternatively, the filtering device may comprise a two dimensional or three dimensional ion trap or time-of-flight (ToF) mass analyzer. The mass analyzer or mass analyzers may comprise or further comprise one or more of a time-of-flight mass analyzer and/or an ion cyclotron resonance mass analyzer and/or an orbitrap mass analyzer and/or a two dimensional or three dimensional ion trap.

Filtering by means of selection based upon mass-to-charge ratio (m/z) can be achieved by using a mass analyzer which can select ions based upon m/z, for example a quadrupole; or by transmitting a wide m/z range, separating ions according to their m/z, and then selecting the ions of interest by means of their m/z value. An example of the latter would be a time-of-flight mass analyzer combined with a timed ion selector(s). The methods and/or systems provided may comprise isolating and/or separating the one or more proteins of interest, for example from two or more of a plurality of proteins, using a chromatographic technique, for example liquid chromatography (LC). The method may further comprise measuring an elution time for the protein of interest and/or comparing the measured elution time with an expected elution time.

Additionally or alternatively, the proteins of interest may be separated using an ion mobility technique, which may be carried out using an ion mobility cell. Additionally, the proteins of interest may be selected by order or time of ion mobility drift. The method may further comprise measuring a drift time for the proteins of interest and/or comparing the measured drift time with an expected drift time.

In some embodiments, the methods and/or systems provided are label-free, where quantitation can be achieved by comparison of the peak intensity, or area under the mass spectral peak for the precursor or product m/z values of interest between injections and across samples. In some embodiments, internal standard normalization may be used to account for any known associated analytical error. Another label-free method of quantification, spectral counting, involves summing the number of fragment ion spectra, or scans, that are acquired for each given peptide, in a non-redundant or redundant fashion. The associated peptide mass spectra for each protein are then summed, providing a measure of the number of scans per protein with this being proportional to its abundance. Comparison can then be made between samples/injections.
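Spectral counting as described above reduces to tallying identified spectra per protein. The following sketch counts spectra in a redundant fashion (every identified scan counts) and assumes the peptide identifications and the peptide-to-protein mapping are already available; the function name, the protein label, and the example data are hypothetical.

```python
from collections import Counter


def spectral_counts(identified_scans, peptide_to_protein):
    """Count fragment ion spectra per protein.

    identified_scans: one peptide sequence per identified MS/MS scan.
    peptide_to_protein: mapping from peptide sequence to its parent protein.
    Scans whose peptide has no mapped protein are ignored.
    """
    counts = Counter()
    for peptide in identified_scans:
        protein = peptide_to_protein.get(peptide)
        if protein is not None:
            counts[protein] += 1
    return counts


# Hypothetical identifications from one injection.
scans_identified = ["SLTSIGEK", "SLTSIGEK", "VAADLR", "UNMAPPED"]
mapping = {"SLTSIGEK": "protein_A", "VAADLR": "protein_A"}
counts = spectral_counts(scans_identified, mapping)
```

Counts from different samples or injections can then be compared, ideally after normalizing for total spectra acquired per run.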

In some embodiments, the ion source is selected from the group consisting of: (1) an electrospray ionization (“ESI”) ion source; (2) an atmospheric pressure photo ionization (“APPI”) ion source; (3) an atmospheric pressure chemical ionization (“APCI”) ion source; (4) a matrix assisted laser desorption ionization (“MALDI”) ion source; (5) a laser desorption ionization (“LDI”) ion source; (6) an atmospheric pressure ionization (“API”) ion source; (7) a desorption ionization on silicon (“DIOS”) ion source; (8) an electron impact (“EI”) ion source; (9) a chemical ionization (“CI”) ion source; (10) a field ionization (“FI”) ion source; (11) a field desorption (“FD”) ion source; (12) an inductively coupled plasma (“ICP”) ion source; (13) a fast atom bombardment (“FAB”) ion source; (14) a liquid secondary ion mass spectrometry (“LSIMS”) ion source; (15) a desorption electrospray ionization (“DESI”) ion source; (16) a nickel-63 radioactive ion source; (17) an atmospheric pressure matrix assisted laser desorption ionization ion source; and (18) a thermospray ion source.

In some embodiments, the methods and/or systems provided comprise an apparatus and/or control system configured to execute a computer program element comprising computer readable program code means for causing a processor to execute a procedure to implement the methods.

In some embodiments, the methods and/or systems provided use an alternating low and elevated energy scan function in combination with liquid chromatography separation of a plant extract. A list of information for proteins of interest can be provided including, but not limited to, m/z of precursor ion, m/z of product ions, retention time, ion mobility drift time and rate of change of mobility. During the course of the LC separation and as the target ions elute into the mass spectrometer (and as either low energy precursor ions or elevated energy product ions are detected, or the retention time window is activated), the mass analyzer of the methods and/or systems provided may select a narrow m/z range (of a variable and changeable width) to pass ions through to the gas cell. Accordingly, the signal to noise ratio can be enhanced significantly for quantification of proteins of interest.
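As a rough illustration of how such a target list might drive retention-time-scheduled selection of a narrow m/z range, consider the sketch below. The `Target` record, its field names, the function `active_windows`, and all numeric values are assumptions made for this example; they do not describe any particular instrument's control software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    """One entry of a hypothetical target list."""
    name: str
    precursor_mz: float   # m/z of the precursor ion
    rt_min: float         # expected retention time, in minutes
    rt_window_min: float  # full width of the retention time window, in minutes
    mz_width: float       # full width of the m/z range to pass


def active_windows(targets, current_rt_min):
    """Return (name, low m/z, high m/z) for targets whose retention window is open."""
    windows = []
    for t in targets:
        if abs(current_rt_min - t.rt_min) <= t.rt_window_min / 2:
            half = t.mz_width / 2
            windows.append((t.name, t.precursor_mz - half, t.precursor_mz + half))
    return windows


# Invented values for illustration only.
example_targets = [
    Target("peptide_A", 500.0, 10.0, 2.0, 2.0),
    Target("peptide_B", 650.5, 15.0, 1.0, 1.0),
]
print(active_windows(example_targets, 10.5))
```

Because each target carries its own `mz_width`, the width of the selected range can vary from target to target, in keeping with the "variable and changeable width" described above.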

In some embodiments, at a chromatographic retention time when a targeted protein of interest is about to elute into the mass spectrometer ion source, the mass analyzer of the methods and/or systems provided can select a narrow m/z range (of a variable and changeable width) according to the targeted precursor ion. These selected ions are then transferred to an instrument stage capable of dissociating the ions by means of alternate and repeated switches between a high fragmentation mode where the sample precursor ions are substantially fragmented into product ions and a low fragmentation mode (or non-fragmentation mode) where the sample precursor ions are not substantially fragmented. Typically, high resolution, accurate mass spectra are acquired in both modes, and at the end of the experiment associated precursor and product ions are recognized by the closeness in fit of their chromatographic elution times and optionally other physicochemical properties. The signal intensity of either the precursor ion or the product ion associated with targeted proteins of interest can be used to determine the quantity of the proteins in the plant extract.

Those skilled in the art would understand certain variation can exist based on the disclosure provided. Thus, the following examples are given for the purpose of illustrating the invention and shall not be construed as being a limitation on the scope of the invention or claims.

EXAMPLES

Example 1

The methods and systems provided are used for determination of endogenous soybean allergen proteins in soybean seed including Gly m 1, Gly m 3, Gly m 4, Gly m 5 (beta-conglycinin), Gly m 6, Kunitz trypsin inhibitor 1, Kunitz trypsin inhibitor 3, Gly m Bd 28 K, Gly m Bd 30 K, and Gly m 8 (2S albumin). A 100±0.5 mg ground soybean seed sample is defatted twice with hexanes and dried before extracting with extraction buffer containing 5 M urea, 2 M thiourea, 50 mM Tris pH 8.0 and 65 mM DTT. The sample is sonicated in a water bath for thirty minutes, vortexed for one minute, sonicated for another thirty minutes and centrifuged at >3,000 rpm for ten minutes at 4° C.

TABLE 1 Preparation of signature peptide calibration standards

Initial concentration (ng/mL) | Standard | Volume of Dilution Cocktail (μL) | Volume of Std. (μL) | Final concentration (ng/mL)
5880.00 | Std 12 | — | — | 500.00
500.00 | Std 11 | 200 | 200 | 250.00
250.00 | Std 10 | 200 | 200 | 125.00
125.00 | Std 9 | 200 | 200 | 62.50
62.50 | Std 8 | 200 | 200 | 31.25
31.25 | Std 7 | 200 | 200 | 15.63
15.63 | Std 6 | 200 | 200 | 7.81
7.81 | Std 5 | 200 | 200 | 3.91
3.91 | Std 4 | 200 | 200 | 1.95
1.95 | Std 3 | 200 | 200 | 0.98
0.98 | Std 2 | 200 | 200 | 0.49
0.49 | Std 1 | 2000 | 2000 | 0.24

The aqueous supernatant is collected and diluted with extraction buffer to bring the endogenous soybean allergen protein concentration into the calibration standard range. The diluted extract is denatured at 95° C. for twenty minutes with the addition of 1 M Tris pH 8.0, 0.5 M DTT and deionized water, followed by refrigeration at 4° C. for ten minutes. The denatured extract is incubated overnight (˜15 hours) at 37° C. with 0.5 mg/mL trypsin enzyme. The digestion reaction is quenched with formic acid/water (50/50 v/v) and centrifuged at >3,000 rpm for ten minutes at 4° C. An aliquot of digested extract is transferred to an autosampler vial and analyzed along with calibration standards by liquid chromatography with positive-ion electrospray (ESI) tandem mass spectrometry (LC-MS/MS). Calibration standards of signature peptides are prepared as listed in Table 1.
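The final concentrations in Table 1 follow a two-fold serial dilution (equal volumes of standard and dilution cocktail at each step), starting from 500.00 ng/mL for Std 12. The short sketch below reproduces that series; the function name `twofold_series` is hypothetical, and half-up rounding to two decimals is an assumption chosen because it matches the tabulated values.

```python
from decimal import Decimal, ROUND_HALF_UP


def twofold_series(top_ng_per_ml, n_levels):
    """Return n_levels concentrations, halving at each step, rounded to 0.01 ng/mL."""
    levels = []
    conc = Decimal(str(top_ng_per_ml))
    for _ in range(n_levels):
        # Round only for display; carry the unrounded value to the next step.
        levels.append(conc.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP))
        conc = conc / 2
    return levels


# Std 12 down to Std 1, final concentrations in ng/mL.
series = twofold_series(500, 12)
for std_number, conc in zip(range(12, 0, -1), series):
    print(f"Std {std_number}: {conc} ng/mL")
```

`Decimal` is used instead of floating point so that values such as 15.625 round up to 15.63, as listed in the table, rather than to the nearest even digit.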

The limits of detection (LOD) and limits of quantitation (LOQ) for endogenous soybean allergens in this example are set forth in Table 2, where LOD and LOQ represent protein concentration (ng/mg).

TABLE 2 Limits of detection (LOD) and limits of quantitation (LOQ) for endogenous soybean allergens in Example 1 (LOD and LOQ represent protein concentration)

Allergen | Signature peptide | LOD (ng/mg) | LOQ (ng/mg)
Gly m 1 | SYPSNATCPR (SEQ ID NO: 1) | 0.23 | 0.46
Gly m 3 | YMVIQGEPGAVIR (SEQ ID NO: 2) | 0.20 | 0.39
Gly m 5 | NILEASYDTK (SEQ ID NO: 3) | 1.22 | 2.44
Glycinin G2 | VTAPAMR (SEQ ID NO: 4) | 1.46 | 2.92
Glycinin G3 | NNNPFSFLVPPK (SEQ ID NO: 5) | 1.58 | 3.16
Glycinin precursor | NGLHLPSYSPYPR (SEQ ID NO: 6) | 3.41 | 6.81
Kunitz trypsin inhibitor 1 | GGGIEVDSTGK (SEQ ID NO: 7) | — | —
Kunitz trypsin inhibitor 3 | GIGTLLSSPYR (SEQ ID NO: 8) | — | —
Gly m Bd 28 K | NKPQFLAGAASLLR (SEQ ID NO: 9) | 5.70 | 11.40
Gly m Bd 30 K | GVITQVK (SEQ ID NO: 10) | 1.15 | 2.30
Gly m 8 | IMENQSEELEEK (SEQ ID NO: 11) | 0.25 | 0.50

Concentrations of allergens are calculated from quantitation of signature peptides (for example using Analyst Bioanalytical software for LC-MS/MS), and validated by other methods including enzyme-linked immunosorbent assays (ELISA). Calculated concentrations of allergens from different samples are compared using statistical analysis, and results show good consistency among samples.
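One common way to calculate a concentration from a signature peptide signal is to fit a calibration line to the standards and back-calculate the unknown from its peak area. The sketch below assumes an unweighted linear fit, which the example does not specify; the function names and all peak areas are hypothetical, and the code is not a description of how the Analyst software performs its calculation.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def back_calculate(peak_area, slope, intercept):
    """Convert a measured peak area to a concentration using the fitted line."""
    return (peak_area - intercept) / slope


# Standard concentrations (ng/mL) taken from Table 1; peak areas are invented.
std_conc = [0.98, 3.91, 15.63, 62.5, 250.0]
std_area = [120.0 * c + 30.0 for c in std_conc]

slope, intercept = fit_line(std_conc, std_area)
unknown_conc = back_calculate(1230.0, slope, intercept)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, unknown={unknown_conc:.3f} ng/mL")
```

Converting the back-calculated peptide concentration (ng/mL) to the protein concentration in the seed sample (ng/mg) would additionally require the dilution factor, the extraction volume, the sample mass, and the peptide-to-protein mass ratio, none of which are modeled here.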

Example 2

Several homologous protein sequences for Gly m 7 are identified from public databases including NCBI, Phytozome, and UniProt. Identified sequences (SEQ ID NOs: 12-15) are analyzed using bioinformatics tools to identify sequence homology and shared sequence composition among the available protein sequences (see FIG. 14). Specifically, this involves the use of the Vector NTI Align X alignment tool, which performs a CLUSTAL W type alignment. From this analysis, a consensus sequence and/or representative sequence can be determined.

Once the consensus sequence and/or representative sequence is chosen or determined, it is digested in silico to generate candidate signature peptide fragments to be detected and measured by LC-MS. According to the unique approaches provided herein, signature peptides are selected based on the degree of conservation among the available protein sequences, such that the selected signature peptide can be used to quantify all or as many protein isoforms as possible among the identified protein sequences found in the public sequence databases. As a result, quantitation of selected signature peptides can not only measure Gly m 7 itself, but also measure potential allergens which are highly homologous to Gly m 7.
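The in silico digestion and conservation-based selection can be sketched as follows. This is a minimal illustration under stated assumptions: trypsin is modeled with the commonly used rule of cleaving C-terminal to K or R except before P, the minimum peptide length of 6 is an arbitrary cutoff, the function names are hypothetical, and the two toy sequences are invented for the example and are not SEQ ID NOs: 12-15.

```python
import re


def tryptic_digest(sequence, min_length=6):
    """Split a protein sequence after K or R, except when followed by P."""
    # Zero-width split points require Python 3.7 or later.
    peptides = re.split(r"(?<=[KR])(?!P)", sequence)
    return [p for p in peptides if len(p) >= min_length]


def conserved_peptides(sequences, min_length=6):
    """Return the tryptic peptides shared by every sequence, sorted alphabetically."""
    peptide_sets = [set(tryptic_digest(s, min_length)) for s in sequences]
    return sorted(set.intersection(*peptide_sets))


# Two invented isoform-like sequences differing at a single residue.
toy_isoforms = [
    "MASKQSWLETRDYTLQAAEKVAADLR",
    "MGSKQSWLETRDYTLQAAEKVSADLR",
]
print(conserved_peptides(toy_isoforms))
```

Peptides that survive the intersection are present in every input sequence and are therefore candidates for quantifying all of the isoforms together; relaxing the intersection to "present in most sequences" would be one way to cover as many isoforms as possible when no peptide is fully conserved.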

Soybean seed samples are ground to a fine powder, defatted twice with hexane, and extracted with suitable assay buffer (for example 5 M urea, 2 M thiourea, 50 mM Tris (pH 8.0), 65 mM DTT). The samples are sonicated in buffer to extract proteins. The extracted proteins are diluted, denatured, and then proteolytically digested by adding trypsin protease and incubating at 37° C. for 15-20 hours. The digestion reactions are acidified with formic acid (pH=1-2) and are analyzed using LC-MS/MS.

The selected signature peptides can be used for both qualitative and quantitative analysis of Gly m 7, either by itself or in combination with additional proteins in a multiplexing assay format. In this example, twelve signature peptides are selected from all peptide possibilities (SEQ ID NO: 32 AAELASMSAGAVK; SEQ ID NO: 33 AMGDIGGR; SEQ ID NO: 34 DTPQGSIEALQAGER; SEQ ID NO: 35 DYTLQAAEK; SEQ ID NO: 36 GLAASAGETAK; SEQ ID NO: 37 QSWLETR; SEQ ID NO: 38 SAAGYAAK; SEQ ID NO: 39 SAGGTTASYVGEK; SEQ ID NO: 40 SAWEQISNYSDQATQGVK; SEQ ID NO: 41 SLTSIGEK; SEQ ID NO: 42 TTAVITCTLEK; and SEQ ID NO: 43 VAADLR), and representative quantitation of these signature peptides is shown in FIGS. 2-13. Synthetic peptides can directly serve as an analytical reference standard for protein quantitation.

1-22. (canceled)
22. A system for quantitating one or more protein of interest with known amino acid sequence in a plant-based sample, the system comprising: (a) a high-throughput means for extracting proteins from a plant-based sample; (b) a process module for digesting extracted proteins with at least one protease; (c) a separation module for separating peptides in a single step; (d) a selection module for selecting a plurality of signature peptides for at least one known Gly m 7 allergen and potential allergens of SEQ ID NO:13-16; and (e) a mass spectrometry for measuring the plurality of signature peptides of SEQ ID NO:32-43.
23. The system of claim 22, wherein the separation module comprises a column chromatography.
24. The system of claim 23, wherein the column chromatography comprises a liquid column chromatography.
25. The system of claim 22, wherein the mass spectrometry comprises a high resolution accurate mass spectrometry (RAM MS).
26. The system of claim 22, wherein the plant-based sample comprises a soybean seed or part of a soybean seed.
27. A high-throughput method of quantitating at least one allergen with known amino acid sequence and homologous potential allergens in a plant-based sample, comprising using the system of claim 22.